On the Spectral Properties of Matrices Associated with Trend Filters

Abstract This paper is concerned with the spectral properties of matrices associated with linear 
filters for the estimation of the underlying trend of a time series. The interest lies in the fact that the 
eigenvectors can be interpreted as the latent components of any time series that the filter smooths 
through the corresponding eigenvalues. A difficulty arises because matrices associated with trend 
filters are finite approximations of Toeplitz operators and therefore very little is known about their 
eigenstructure, which also depends on the boundary conditions or, equivalently, on the filters for 
trend estimation at the end of the sample. 

Assuming reflecting boundary conditions, we derive a time series decomposition in terms of
periodic latent components and corresponding smoothing eigenvalues. This decomposition depends
on the local polynomial regression estimator chosen for the interior. Otherwise, the eigenvalue
distribution is derived up to an approximation measured by the size of the perturbation that
different boundary conditions induce on the eigenvalues of matrices belonging to algebras with known
spectral properties, such as the Circulant or the Cosine. The analytical form of the eigenvectors is
then derived with an approximation that involves the extremes only.

A further topic investigated in the paper concerns a strategy for a filter design in the time 
domain. Based on cut-off eigenvalues, new estimators are derived, that are less variable and almost 
equally biased as the original estimator, based on all the eigenvalues. Empirical examples illustrate 
the effectiveness of the method. 

Keywords Smoothing, Toeplitz matrices, Spectral analysis, Boundary conditions, Matrix
algebras, Approximate asymmetric filters, Bias-variance trade-off.



1 Introduction 

The smoothing problem has a long and well established tradition in statistics and has a wide range
of applications in time series analysis; see Anderson (1971, ch. 3), Kendall (1973), Kendall, Stuart
and Ord (1983) and Cleveland and Loader (1996). In its simplest form, it aims at providing a
measure of the underlying tendency from noisy observations, and takes the name of signal extraction
in engineering, trend estimation in econometrics, and graduation in actuarial sciences. This paper
is concerned with local polynomial regression methods, which developed as an extension of least
squares regression and result in estimates that are linear combinations of the available
information. These linear combinations are often termed filters and their analysis provides useful insight
into what the method does.

The properties of linear filters are traditionally studied from different, complementary viewpoints.
In the time domain, the analysis of the filter weights provides information on the amount of bias 
introduced and variance left in the input data from the smoothing procedure. In the frequency 
domain, the basic assumption is that a time series can be decomposed as a linear combination 
of trigonometric functions. The variability and the dependence relation among the variables are 
then evaluated in terms of the contribution of such components with respect to some frequency or 
periodicity, usually measured in radians. 

An alternative approach consists of analysing the matrices associated with linear filters. Though 
smoothers have been introduced in a time series framework, with the works of Whittaker (1923) 
on spline smoothing and of Henderson (1916, 1924) on graduation by averages, they have been 
mainly analysed in the context of linear regression and in generalised additive models,
following the approach of Buja, Hastie and Tibshirani (1989, section 2) and Hastie and Tibshirani (1990,
section 3.7), based on the smoother matrices associated with linear estimators. In these references,
the attention is concentrated on symmetric matrices that arise as the solutions of penalised least
squares problems, such as the cubic smoothing spline estimators (see Wahba, 1990, and Green
and Silverman, 1994). The spectral properties of smoother matrices are analysed and inferential
procedures based on eigenvalues and eigenvectors are developed. The authors remark that
eigenanalysis is no longer useful for non-symmetric smoother matrices because of complex eigenvalues
and eigenvectors and argue that the spectral analysis of a smoother matrix is closely related to the
study of the transfer function of the associated linear filter for time series.

These two remarks motivated the present paper. In considering local polynomial regression 
methods for the estimation of the underlying trend of a time series, symmetry is in general lost 
and replaced by centrosymmetry. Likewise, the interpretation that can be ascribed to the
eigenvalues and eigenvectors of time series smoothing matrices (let us suppose for the moment
that we are able to reduce the problem to the real or to the symmetric case) provides useful
information on the estimation method. In fact, the eigenvectors of matrices associated with local
polynomial regression estimators can be interpreted as the latent components of any time series 



that the filter smooths through the corresponding eigenvalues. This interpretation allows a
decomposition of a time series into periodic latent components that depend on the estimation method and
opens the way to eigenvalue-based inferential procedures. Furthermore, it is possible to establish
a formal connection between the spectrum of a smoothing matrix and the transfer function of
the associated filter.

This paper analyses the spectral properties of matrices associated with trend filters. In referring
to spectral properties in a time series setting, we shall distinguish between two established
theories: the spectral analysis of a linear filter, where the filter properties are studied in the frequency
domain, and the spectral properties of the associated matrix, i.e. the study of its eigenvalues and
eigenvectors. Both these techniques are related to the concept of spectrum, understood as
a latent characteristic that cannot be directly observed. The spectral properties of linear filters
have been widely investigated in time series analysis, where classical references are the books by 
Jenkins and Watts (1968), Priestley (1981), Bloomfield (2000). On the other hand, the spectral 
properties of the associated matrices have not been explored. One reason is the
lack of attention surrounding time series smoothing matrices. Another is that
the mathematics of these matrices is rather difficult. In fact, they can be interpreted
as finite approximations of infinite symmetric banded Toeplitz operators. The latter have been
extensively explored, but their finite counterparts subject to boundary conditions are much more
difficult to analyse (see Böttcher and Grudsky, 2005; see also Gray, 2006). Established results hold
for tridiagonal matrices, but when the span of the filter increases, the algebra becomes extremely
complicated and, except for some cases, only approximate results can be obtained. The size of
the approximation essentially depends on the boundary conditions on the finite matrix.
Furthermore, the boundary conditions determine the asymmetric filters for the estimation of the trend at
the extremes of the series. Specifically, two-sided symmetric filters cannot be applied since future 
(or past) observations are not available. It should be remarked that the estimates at the end of the 
sample are crucial in current analysis. 

We derive approximate results on the eigenvalues and eigenvectors of matrices associated with
trend filters by interpreting the latter as perturbations of matrices belonging to the circulant and to
the reflecting algebras, for which eigenvalues and eigenvectors are known exactly even in finite
dimensions. The underlying hypothesis is that of a circular and of a reflecting process,
respectively. The key result is a perturbation theorem that draws some conclusions on the distribution
of the eigenvalues of the original smoothing matrices. We then relate the absolute eigenvalue
distribution to the gain function of the corresponding symmetric filter. To illustrate these results,
we consider a class of asymmetric filters that approximate a given symmetric estimator with a
minimum mean square revision error strategy, subject to polynomial constraints. This class
encompasses the local polynomial regression filters that automatically adapt at the boundaries and
that under mild assumptions on the trend are unbiased estimators. Concerning the eigenvectors,
we show that filters that are unbiased with respect to polynomial trends of order p have p + 1



eigenvectors that describe polynomial functions up to the degree p. The analytical form of the
remaining eigenvectors is derived with an approximation which involves the extremes only. A
further topic investigated in the paper concerns a strategy for filter design in the time domain.
Based on cut-off eigenvalues, it is possible to obtain new estimators that, in the interior, have less
variance than, and almost the same bias as, the original estimator. The effectiveness of this method is
illustrated with empirical examples. We would like to remark that even if these results are derived
in a time series setting, they apply to any non-symmetric banded smoother matrix.

The paper is organised as follows. Section 2 reviews the derivation of linear smoothers for 
trend extraction, both in the interior and at the boundaries (section 2.1), providing examples that 
will be used for the applications of the methods developed later on in the paper. In section 3, time 
series smoothing matrices are introduced and their properties are illustrated. Section 4 contains 
the major results of the paper, i.e. the spectrum analysis of matrices associated with trend filters.
Specifically, two sets of boundary conditions are considered, circulant (section 4.2) and
reflecting (section 4.3). Furthermore, we provide the interpretation of the eigenvectors as analytical
periodic functions of time. In section 5, a strategy for filter design based on a selected number
of latent components is derived, based on a suitably chosen cut-off eigenvalue. The bias-variance
trade-off between old and new estimators is evaluated (section 5.1) and the new filters are applied
to real data (section 5.2). Section 6 summarises and comments on the results. Proofs and other 
technical details are given in section 7. 

2 Local polynomial regression methods 

Time series analysis is often based on additive models like 

$$y_t = \mu_t + \varepsilon_t, \quad t = 1, \dots, n, \qquad (1)$$

where $y_t$ is the observed time series, $\mu_t$ is the trend component, also termed the signal, and $\varepsilon_t$
is the noise, or irregular, component. The signal $\mu_t$ can be a random or deterministic smooth
function of time, whereas the most common assumption for the noise $\varepsilon_t$ is that it follows a zero
mean stochastic process, such as a white noise and/or Gaussian process. Let us assume that in (1) $\mu_t$ is an
unknown deterministic function of time, so that $E(y_t) = \mu_t$, and that equally spaced observations
$y_{t+j}$, $j = 0, \pm 1, \pm 2, \dots, \pm h$, are available in a neighbourhood of time $t$. Our interest lies in estimating
the level of the trend at time $t$, $\mu_t$, using the available observations. If $\mu_t$ is differentiable, using
the Taylor-series expansion it can be locally approximated by a polynomial of degree $p$ of the time
distance, $j$, between $y_t$ and the neighbouring observations $y_{t+j}$. Hence, $\mu_{t+j} \approx m_{t+j}$, with

$$m_{t+j} = \beta_0 + \beta_1 j + \cdots + \beta_p j^p, \quad j = 0, \pm 1, \dots, \pm h.$$

The degree of the polynomial is crucial in determining the accuracy of the approximation. Another
essential quantity is the size $h$ of the neighbourhood around time $t$; for $t = h + 1, \dots, n - h$,
the neighbourhood consists of $2h + 1$ consecutive and regularly spaced time points at which
observations $y_{t+j}$ are made. At the boundaries, asymmetric neighbourhoods will be considered. The
parameter $h$ is the bandwidth, for which we assume $p < 2h$ throughout.
Replacing $\mu_{t+j}$ by its approximation gives the local polynomial model:

$$y_{t+j} = \sum_{k=0}^{p} \beta_k j^k + \varepsilon_{t+j}, \quad j = 0, \pm 1, \dots, \pm h. \qquad (2)$$



Assuming that $\varepsilon_{t+j} \sim \mathrm{NID}(0, \sigma^2)$, then (2) is a linear Gaussian regression model with explanatory
variables given by the powers of the time distance $j^k$, $k = 0, \dots, p$, and unknown coefficients $\beta_k$,
which are proportional to the $k$-th order derivatives of $\mu_t$. Working with the linear Gaussian
approximating model, we are faced with the problem of estimating $m_t = \beta_0$, i.e. the value of the
approximating polynomial for $j = 0$, which is the intercept $\beta_0$ of the approximating polynomial.
The model (2) can be rewritten in matrix notation as follows:



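$$y = X\beta + \varepsilon, \quad \varepsilon \sim N(0, \sigma^2 I), \qquad (3)$$

where $y = [y_{t-h}, \dots, y_t, \dots, y_{t+h}]'$, $\varepsilon = [\varepsilon_{t-h}, \dots, \varepsilon_t, \dots, \varepsilon_{t+h}]'$,
$\beta = [\beta_0, \beta_1, \dots, \beta_p]'$, and $X$ is the $(2h + 1) \times (p + 1)$ design matrix whose row for the time
distance $j$ is $[1, j, j^2, \dots, j^p]$.

Provided that $p < 2h$, the $p + 1$ unknown coefficients $\beta_k$, $k = 0, \dots, p$, can be estimated by
the method of weighted least squares, which consists of minimising with respect to the $\beta_k$'s the
objective function: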



$$S(\beta_0, \dots, \beta_p) = \sum_{j=-h}^{h} \kappa_j \left( y_{t+j} - \beta_0 - \beta_1 j - \beta_2 j^2 - \cdots - \beta_p j^p \right)^2,$$



where $\kappa_j \ge 0$ is a set of weights that define, either explicitly or implicitly, a kernel function. In
general, kernels are chosen to be symmetric and non-increasing functions of $|j|$, in order to weight
the observations differently according to their distance from time $t$; in particular, larger weight may
be assigned to the observations that are closer to $t$. As a result, the influence of each individual
observation is controlled not only by the bandwidth $h$ but also by the kernel. Defining $K =
\mathrm{diag}(\kappa_h, \dots, \kappa_1, \kappa_0, \kappa_1, \dots, \kappa_h)$, the WLS estimate of the coefficients is $\hat{\beta} = (X'KX)^{-1} X'Ky$.



In order to obtain $\hat{m}_t = \hat{\beta}_0$, we need to select the first element of the vector $\hat{\beta}$. Hence, denoting
by $e_1$ the $(p + 1)$-vector $e_1' = [1, 0, \dots, 0]$,

$$\hat{m}_t = e_1'\hat{\beta} = e_1'(X'KX)^{-1}X'Ky = w'y = \sum_{j=-h}^{h} w_j y_{t-j},$$

which expresses the estimate of the trend as a linear combination of the observations with
coefficients

$$w' = e_1'(X'KX)^{-1}X'K. \qquad (4)$$

The linear combination yielding the trend estimate is the local polynomial two-sided filter. It
satisfies $X'w = e_1$. As a consequence, the filter $w$ is said to preserve a deterministic polynomial
of order $p$. Moreover, the filter weights are symmetric ($w_j = w_{-j}$), which follows from the
symmetry of the kernel weights $\kappa_j$ and the assumption that the available observations are equally
spaced.

As an example that we shall adopt in the following, we consider the Henderson filter
(Henderson, 1916; see also Kenny and Durbin, 1982, Loader, 1999, Ladiray and Quenneville, 2001)
that arises as the weighted least squares estimator of a local cubic trend at time $t$ using the kernel
$\kappa_j = [(h + 1)^2 - j^2][(h + 2)^2 - j^2][(h + 3)^2 - j^2]$. These weights minimise the variance of
the third differences of the estimated trend (maximum smoothness criterion), subject to the cubic
reproducing property.
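The construction of the two-sided filter in (4) can be sketched numerically. The block below is a minimal illustration in Python with NumPy (a choice made here, not taken from the paper); the function names `henderson_kernel` and `local_polynomial_filter` are hypothetical. The kernel is rescaled to a maximum of one, which leaves the weights unchanged and only helps the conditioning of $X'KX$.

```python
import numpy as np

def henderson_kernel(h):
    """Kernel weights kappa_j, j = -h..h, rescaled so that the largest weight is 1."""
    j = np.arange(-h, h + 1)
    kappa = ((h + 1) ** 2 - j ** 2) * ((h + 2) ** 2 - j ** 2) * ((h + 3) ** 2 - j ** 2)
    return kappa / kappa.max()

def local_polynomial_filter(h, p, kappa):
    """Two-sided weights w = K X (X'KX)^{-1} e_1, i.e. equation (4) transposed."""
    j = np.arange(-h, h + 1)
    X = np.vander(j, p + 1, increasing=True).astype(float)  # columns 1, j, ..., j^p
    KX = kappa[:, None] * X                                  # K X with K = diag(kappa)
    e1 = np.zeros(p + 1)
    e1[0] = 1.0
    return KX @ np.linalg.solve(X.T @ KX, e1)

h, p = 6, 3                                   # 13-term filter, local cubic fit
w = local_polynomial_filter(h, p, henderson_kernel(h))
X = np.vander(np.arange(-h, h + 1), p + 1, increasing=True).astype(float)

print(np.round(w, 5))                         # the 2h + 1 symmetric weights
print(np.round(X.T @ w, 10))                  # polynomial reproduction: X'w = e_1
```

The second printed vector checks the reproduction property $X'w = e_1$, and reversing `w` checks the symmetry $w_j = w_{-j}$.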

2.1 Asymmetric filters for the estimation at the boundaries 

The derivation of the two-sided symmetric filter has assumed the availability of $2h + 1$ observations
centred at $t$. Obviously, for a given finite sequence $y_t$, $t = 1, \dots, n$, it is not possible to obtain the
estimates of the signal for the (first and) last $h$ time points, which is inconvenient, since we are
typically most interested in the most recent estimates.

We can envisage three fundamental approaches to the estimation of the signal at the extremes 
of the sample period: 

1. the construction of asymmetric filters that result from fitting a local polynomial to the
available observations $y_t$, $t = n - h + 1, n - h + 2, \dots, n$;

2. the application of the symmetric two-sided filter $w$ to the series extended by $h$ forecasts
$\hat{y}_{n+l|n}$, $l = 1, \dots, h$ (and backcasts $\hat{y}_{1-l|n}$);

3. the derivation of the asymmetric filter which minimises the revision mean square error sub- 
ject to polynomial reproducing constraints. 

The trend estimates for the last $h$ data points, $\hat{m}_{n-h+1|n}, \dots, \hat{m}_{n|n}$, use respectively $2h, 2h - 1, \dots, h + 1$
observations. It is thus inevitable that the last $h$ estimates of the trend will be subject
to revision as new observations become available. In the sequel we shall denote by $q$ the number of
future observations available at time $t$ (the period to which our estimate refers), $q = 0, \dots, h$,
and by $\hat{m}_{t|t+q}$ the estimate of the signal at time $t$ using the information available up to time $t + q$,
with $0 \le q \le h$; $\hat{m}_{t|t}$ is usually known as the real time estimate since it uses only the past and
current information.

We now review the first strategy, which results from the automatic adaptation of the local
polynomial filter to the available sample; we then interpret the results in terms of the other two
strategies. The approximate model $y_{t+j} = m_{t+j} + \varepsilon_{t+j}$ is assumed to hold for $j = -h, -h + 1, \dots, q$,
and the estimators of the coefficients $\beta_k$, $k = 0, \dots, p$, minimise

$$\sum_{j=-h}^{q} \kappa_j \left( y_{t+j} - \beta_0 - \beta_1 j - \cdots - \beta_p j^p \right)^2.$$
Let us partition the matrices $X$, $K$ and the vector $y$ as follows:

$$X = \begin{bmatrix} X_p \\ X_f \end{bmatrix}, \quad y = \begin{bmatrix} y_p \\ y_f \end{bmatrix}, \quad K = \begin{bmatrix} K_p & 0 \\ 0 & K_f \end{bmatrix},$$

where $y_p$ denotes the set of available observations, whereas $y_f$ is missing, and $X$ and $K$ are
partitioned accordingly. The local polynomial regression (LPR) filters arising as the solution to
the above weighted least squares problem are written in matrix notation as:



$$w_a = K_p X_p (X_p' K_p X_p)^{-1} e_1.$$



Equivalently,

$$w_a = w_p + K_p X_p (X_p' K_p X_p)^{-1} X_f' w_f, \qquad (6)$$

which is obtained by partitioning the two-sided symmetric filter in two groups, $w = [w_p', w_f']'$,
where $w_p$ contains the weights attributed to the past and current observations and $w_f$ those
attached to the future unavailable observations. The proof of (6) can be found in Proietti and Luati
(2007), where detailed proofs of other results that will be used in this section, such as (8), are
also available. Equation (6) represents the fundamental relationship which states how the
asymmetric LPR filter weights are obtained from the symmetric ones. Premultiplying both sides by
$X_p'$ we can see that the asymmetric filter weights satisfy the polynomial reproduction constraints
$X_p' w_a = X_p' w_p + X_f' w_f = X'w$. Thus, the bias in estimating an unknown function of time has
the same order of magnitude as in the interior of the time support.
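The equivalence between the direct weighted least squares fit on the available observations and relationship (6) can be checked numerically. The following sketch assumes the Henderson kernel with $h = 6$, a local cubic ($p = 3$) and $q = 2$ future observations; these settings and all function names are illustrative choices, not part of the original derivation.

```python
import numpy as np

def design(h, p):
    """Design matrix X with rows [1, j, ..., j^p], j = -h..h."""
    j = np.arange(-h, h + 1)
    return np.vander(j, p + 1, increasing=True).astype(float)

def henderson_kernel(h):
    j = np.arange(-h, h + 1)
    kappa = ((h + 1) ** 2 - j ** 2) * ((h + 2) ** 2 - j ** 2) * ((h + 3) ** 2 - j ** 2)
    return kappa / kappa.max()                 # rescaling does not change the weights

def lpr_weights(X, kappa):
    """K X (X'KX)^{-1} e_1 for whichever rows of X (and kappa) are supplied."""
    KX = kappa[:, None] * X
    e1 = np.zeros(X.shape[1])
    e1[0] = 1.0
    return KX @ np.linalg.solve(X.T @ KX, e1)

h, p, q = 6, 3, 2                              # q future observations are available
X, kappa = design(h, p), henderson_kernel(h)
m = h + q + 1                                  # number of available observations
Xp, Xf, kp = X[:m], X[m:], kappa[:m]

w = lpr_weights(X, kappa)                      # two-sided symmetric filter
wp, wf = w[:m], w[m:]

w_a_direct = lpr_weights(Xp, kp)               # direct fit on the available sample
KXp = kp[:, None] * Xp
w_a_eq6 = wp + KXp @ np.linalg.solve(Xp.T @ KXp, Xf.T @ wf)   # right-hand side of (6)

print(np.max(np.abs(w_a_direct - w_a_eq6)))    # numerically negligible
print(np.round(Xp.T @ w_a_direct, 10))         # same reproduction constraints as X'w
```

Both routes give the same asymmetric weights, and the last line confirms $X_p' w_a = X'w = e_1$.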

The filter resulting from the automatic adaptation of the local polynomial fit can be equivalently 
derived using the second strategy, assuming that the future observations are generated according 
to a polynomial function of time of degree p, so that the optimal forecasts are generated by the 
same polynomial model. 

The third strategy consists in determining the asymmetric filter v minimising the mean square 
revision error subject to constraints. Let us rewrite the regression model (3) as 



$$y = U\gamma + Z\delta + \varepsilon, \quad \varepsilon \sim N(0, D),$$



where we have partitioned the columns of the design matrix $X = [U | Z]$, in order to separate
the polynomial constraints imposed on the filter from those assumed for the trend. Specifically,
the constraints are specified as follows: $U_p' v = U'w$, where $U = [U_p', U_f']'$. Writing $D =
\mathrm{diag}(D_p, D_f)$, the set of asymmetric weights minimises with respect to $v$ the following objective
function

$$\varphi(v) = (v - w_p)' D_p (v - w_p) + w_f' D_f w_f + [\delta'(Z_p' v - Z'w)]^2 + 2 l'(U_p' v - U'w). \qquad (7)$$

The revision error arising in estimating the signal $m_t$ is $\hat{m}_{t|t} - \hat{m}_t = v'y_p - w'y$. Replacing
$y_p = U_p\gamma + Z_p\delta + \varepsilon_p$ and $y = U\gamma + Z\delta + \varepsilon$, and using $U_p'v - U'w = 0$, we obtain
$\hat{m}_{t|t} - \hat{m}_t = (v'Z_p - w'Z)\delta + v'\varepsilon_p - w'\varepsilon$, where $\varepsilon = [\varepsilon_p', \varepsilon_f']'$. Hence, the first three summands of
(7) represent the mean square revision error, which is broken down into the revision error variance
(the first two terms) and the squared bias term $[\delta'(Z_p'v - Z'w)]^2$. The vector $l$ is a vector of
Lagrange multipliers. The solution is

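$$v = w_p + L U_f' w_f + M Z_p \delta\delta' Z_f' w_f, \qquad (8)$$

with

$$M = Q^{-1} - Q^{-1} U_p (U_p' Q^{-1} U_p)^{-1} U_p' Q^{-1}, \quad L = Q^{-1} U_p (U_p' Q^{-1} U_p)^{-1},$$

where, from the first-order conditions of (7), $Q = D_p + Z_p \delta\delta' Z_p'$.

The matrices $M$ and $L$ have the following properties: $U_p' M = 0$, $U_p' L = I$. It should be noticed
that the LPR filters arise in the case $D = K^{-1}$ and $U = X$, so that the bias term is zero.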

The merits of the class of filters (8), relative to the LPR asymmetric filters, lie in the bias- 
variance trade-off. In particular, the bias can be sacrificed for improving the variance properties of 
the corresponding asymmetric filter. 
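A minimal numerical sketch of the class (8) follows. It takes $Q = D_p + Z_p\delta\delta'Z_p'$, as implied by the first-order conditions of (7), and treats two cases for the real-time filter ($q = 0$) of the 13-term Henderson filter: a constant-fit, linear-trend setting with $D = I$ and $\delta^2 = 4/(3.5^2\pi)$, and the special case $D = K^{-1}$, $U = X$. The function name `constrained_filter` and the way the inputs are passed are assumptions made for illustration only.

```python
import numpy as np

def henderson_kernel(h):
    j = np.arange(-h, h + 1)
    kappa = ((h + 1) ** 2 - j ** 2) * ((h + 2) ** 2 - j ** 2) * ((h + 3) ** 2 - j ** 2)
    return kappa / kappa.max()

def lpr_weights(X, kappa):
    """K X (X'KX)^{-1} e_1 for the supplied rows."""
    KX = kappa[:, None] * X
    e1 = np.zeros(X.shape[1])
    e1[0] = 1.0
    return KX @ np.linalg.solve(X.T @ KX, e1)

def constrained_filter(w, U, Z, d, delta, q):
    """v = w_p + L U_f' w_f + M Z_p delta delta' Z_f' w_f, equation (8).

    d is the diagonal of D; q is the number of available future observations.
    """
    h = (len(w) - 1) // 2
    m = h + q + 1                                  # available observations
    wp, wf = w[:m], w[m:]
    Up, Uf, Zp, Zf = U[:m], U[m:], Z[:m], Z[m:]
    zd = Zp @ delta                                # Z_p delta
    Q = np.diag(d[:m]) + np.outer(zd, zd)          # Q = D_p + Z_p delta delta' Z_p'
    QiU = np.linalg.solve(Q, Up)                   # Q^{-1} U_p
    L = QiU @ np.linalg.inv(Up.T @ QiU)
    M = np.linalg.inv(Q) - L @ QiU.T
    bias_scalar = delta @ (Zf.T @ wf)              # delta' Z_f' w_f
    return wp + L @ (Uf.T @ wf) + (M @ zd) * bias_scalar

h, q = 6, 0
X = np.vander(np.arange(-h, h + 1), 4, increasing=True).astype(float)
kappa = henderson_kernel(h)
w = lpr_weights(X, kappa)                          # symmetric Henderson filter

# Constant fit, linear trend: U = vector of ones, Z = time distance, D = I
delta_lc = np.array([np.sqrt(4.0 / (3.5 ** 2 * np.pi))])
v_lc = constrained_filter(w, X[:, :1], X[:, 1:2], np.ones(2 * h + 1), delta_lc, q)

# Special case D = K^{-1}, U = X (no Z columns): should coincide with the LPR filter
v_lpr = constrained_filter(w, X, X[:, :0], 1.0 / kappa, np.zeros(0), q)
w_a = lpr_weights(X[: h + q + 1], kappa[: h + q + 1])

print(np.round(v_lc, 5), v_lc.sum())
print(np.max(np.abs(v_lpr - w_a)))
```

Only the ratio $\delta^2/\sigma^2$ matters for the minimiser, so the sketch sets $\sigma^2 = 1$ and passes that ratio through `delta`.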

3 Matrices associated with local polynomial regression estimators 

Any linear operator acting on an n-dimensional time series y to produce smooth estimates of the 
underlying trend can be represented in matrix form as 

$$S y = \hat{m},$$

where $S$ is the $n \times n$ smoothing matrix representing a weighted average to be applied to the
observations in a moving manner and $y$ is, from now on, the $n$-dimensional vector containing all
the observations. In practice, $S$ can be constructed as the matrix canonically associated with the
linear transformation $s$, so that its columns contain the coordinates of the $s$-transformed vectors of
the canonical basis $\mathcal{E} = \{e_1, e_2, \dots, e_n\}$, where $e_j$ is the vector with all zeros except for the $j$-th
element, equal to one, taken with respect to the canonical basis itself, i.e. $S = [s(e_1)_{\mathcal{E}} \cdots s(e_n)_{\mathcal{E}}]$.
The rows of $S$, denoted by $w_t'$, are the filters and the generic element $w_{tj}$ is the weight to
be assigned to the observation $y_j$ to get the estimated value $\hat{m}_t$. The weights are null outside a
bandwidth whose length, a function of $h$, depends on the local estimation method. In general,
the $n - 2h$ central values are estimated by applying $2h + 1$ symmetric weights to consecutive
observations centred in $t$, whereas the first and last $h$ trend estimates are obtained by applying
asymmetric filters of variable length to the available observations at the boundaries of the series.
Thus it follows that $S$ is a banded matrix with the following structure

$$S = \begin{bmatrix} S_a^{(h \times 2h)} \quad 0^{(h \times (n - 2h))} \\ S_s^{((n - 2h) \times n)} \\ 0^{(h \times (n - 2h))} \quad S_{a^*}^{(h \times 2h)} \end{bmatrix}, \qquad (9)$$

where $S_s$ is the submatrix whose rows are the symmetric filters, while $S_a$ and $S_{a^*}$ contain the
asymmetric filters to be applied to the first and last observations, respectively; the numbers in
parentheses indicate the dimensions of the submatrices.

$S$ is centrosymmetric, in that $w_{tj} = w_{n+1-t,\,n+1-j}$; $S_s$ is rectangular centrosymmetric, whereas
$S_a$ and $S_{a^*}$ are the $t$-transform of one another, where $t$ is a linear transformation that consists in
the pre- and post-multiplication of a matrix by the exchange matrix $E_k \in \mathbb{R}^{k \times k}$ having ones on
the cross diagonal (bottom left to top right) and zeros elsewhere (Dagum and Luati, 2004). For
example, $S_{a^*} = t(S_a) = E_h S_a E_{2h}$. Centrosymmetric matrices are invariant with respect to $t$ and preserve
their structure under matrix multiplication, thus allowing the convolution of linear filters to be a
linear filter as well. On the other hand, they are in general not symmetric, with the consequence
that their eigenvalues and eigenvectors can be complex. In dealing with real data, such as time series,
this is inconvenient. Moreover, very little is known about the analytical form of such quantities,
except that eigenvectors are either symmetric or skew-symmetric (Weaver, 1985), i.e. invariant or
equal to their opposite if premultiplied by $E_n$. For symmetric matrices, some results can be found
in Cantoni and Butler (1976) and Makhoul (1981).
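To make the structure of $S$ concrete, the sketch below assembles a smoothing matrix for $n = 51$ and $h = 6$, using the Henderson kernel with local cubic fits that adapt automatically at the boundaries (the LPR filters of section 2.1). The helper names and the choice of $n$ are illustrative assumptions. The block then checks centrosymmetry through $E_n S E_n = S$ and reports whether $S$ is symmetric and whether its eigenvalues have imaginary parts.

```python
import numpy as np

def henderson_kernel(h):
    """Henderson kernel kappa_j for j = -h..h, rescaled to a maximum of 1."""
    j = np.arange(-h, h + 1)
    kappa = ((h + 1) ** 2 - j ** 2) * ((h + 2) ** 2 - j ** 2) * ((h + 3) ** 2 - j ** 2)
    return kappa / kappa.max()

def lpr_row(j, kappa_j, p):
    """Local polynomial weights K X (X'KX)^{-1} e_1 on the time distances j."""
    X = np.vander(j, p + 1, increasing=True).astype(float)
    KX = kappa_j[:, None] * X
    e1 = np.zeros(p + 1)
    e1[0] = 1.0
    return KX @ np.linalg.solve(X.T @ KX, e1)

def smoothing_matrix(n, h, p=3):
    """n x n matrix S: symmetric filter in the interior, LPR filters at the ends."""
    kappa = henderson_kernel(h)
    S = np.zeros((n, n))
    for t in range(n):
        lo, hi = max(0, t - h), min(n - 1, t + h)   # observations available around t
        j = np.arange(lo - t, hi - t + 1)           # their time distances from t
        S[t, lo:hi + 1] = lpr_row(j, kappa[j + h], p)
    return S

n, h = 51, 6
S = smoothing_matrix(n, h)
E = np.fliplr(np.eye(n))                            # exchange matrix E_n

print("centrosymmetric:", np.allclose(E @ S @ E, S))
print("symmetric:      ", np.allclose(S, S.T))
print("max |Im(eigenvalue)|:", np.abs(np.linalg.eigvals(S).imag).max())
```

Each row of `S` is one filter $w_t'$; the first and last $h$ rows use between $h + 1$ and $2h$ observations, matching the block layout described above.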

The rest of the paper deals with the spectral analysis of matrices like S. In the next section, we 
will define the problem and review some asymptotic results that hold in the ideal case of doubly 
infinite samples. Then, the main results on the eigenvalues and eigenvectors in finite dimension 
will be derived. 

4 Spectral analysis 

The scalar $\lambda$ is an eigenvalue of $S$ if there exists a non-null vector $x$ such that $Sx = \lambda x$, and $x$ is
the eigenvector of $S$ corresponding to $\lambda$. If we could virtually take an infinite time series and apply
the two-sided symmetric filter to all the observations, then we would have an infinite smoothing
matrix structured like a symmetric banded Toeplitz (SBT), with real eigenvalues and eigenvectors.
Let us suppose that the eigenvalues can be ordered in a countable decreasing sequence, $\lambda_1 \ge
\lambda_2 \ge \cdots \ge \lambda_n \ge \cdots$. Hence, the eigenvectors $x_1, x_2, \dots, x_n, \dots$ can be interpreted as time series
that the filter expands, $\lambda_i > 1$, leaves unchanged, $\lambda_i = 1$, shrinks, $\lambda_i < 1$, or suppresses, $\lambda_i = 0$,
$i = 1, 2, \dots, n, \dots$. We may ask how these series behave and how they are modified by the
corresponding eigenvalues.

Because of their symmetric or skew-symmetric nature, the eigenvectors are likely to be
interpreted as polynomials or as periodic components. Thus, since we are dealing with matrices
associated with trend filters, what we expect is that low frequency components associated with smooth
variations of the underlying process are represented by long period eigenvectors and associated
with eigenvalues close to unity. On the other hand, we expect that high frequency components
associated with erratic fluctuations will be represented by short period eigenvectors associated with
eigenvalues close to zero. Hence the eigenvectors of $S$ can be interpreted as the periodic latent
components of any time series, modified by the filter through multiplication by the corresponding
eigenvalues. In fact, let us consider the linear combination

$$y = \alpha_1 x_1 + \alpha_2 x_2 + \cdots + \alpha_n x_n + \cdots$$

then

$$S y = \sum_{i=1}^{k} \lambda_i \alpha_i x_i + \sum_{i=k+1}^{\infty} \lambda_i \alpha_i x_i,$$

where the $\alpha_i$ depend on the series $y$, in that they re-scale the amplitude of each periodic component,
and the $\lambda_i$ depend on the smoothing matrix $S$, i.e. on the filter. It follows that, independently of
the $\alpha_i$, there will be $k$ components that the filter leaves unchanged or smoothly shrinks, and these
account for the signal, and the remaining components will be almost suppressed, and these account
for the noise.
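The decomposition can be illustrated in finite dimension with any symmetric smoothing matrix, for which the eigenvectors are real and orthonormal. The sketch below uses a hypothetical three-term filter $(1/4, 1/2, 1/4)$ with circular wrap-around, chosen only because it keeps the matrix symmetric; the series, the seed and the cut-off value $0.5$ are likewise illustrative assumptions.

```python
import numpy as np

n = 24
weights = {-1: 0.25, 0: 0.5, 1: 0.25}      # hypothetical symmetric filter
S = np.zeros((n, n))
for t in range(n):
    for d, wd in weights.items():
        S[t, (t + d) % n] += wd            # circular boundary keeps S symmetric

lam, V = np.linalg.eigh(S)                 # real eigenvalues, orthonormal eigenvectors
order = np.argsort(lam)[::-1]              # sort eigenvalues in decreasing order
lam, V = lam[order], V[:, order]

rng = np.random.default_rng(0)
t = np.arange(n)
y = np.sin(2 * np.pi * t / n) + 0.3 * rng.standard_normal(n)

alpha = V.T @ y                            # coefficients of y on the eigenvectors
k = int(np.sum(lam > 0.5))                 # illustrative cut-off eigenvalue
signal_part = V[:, :k] @ (lam[:k] * alpha[:k])
noise_part = V[:, k:] @ (lam[k:] * alpha[k:])

print("k =", k)
print("reconstruction error:", np.max(np.abs(S @ y - (signal_part + noise_part))))
```

The two partial sums add up to $Sy$ exactly; the split between them depends only on the chosen cut-off, not on the series.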

The choice of $k$ turns out to be a filter design problem in the time domain. There is a
mathematically elegant exact solution, which occurs if $\mathrm{rank}(S) = k$, that is, $\hat{m}$ belongs to the column space
$\mathcal{C}(S)$ and $\varepsilon$ lies in the null space $\mathcal{N}(S)$. In practice, even if many of the eigenvalues are close to
zero, $S$ is full rank and therefore we may only look for an approximate solution that consists of
choosing a cut-off time or a cut-off eigenvalue. To do this, it is necessary to know the analytical
form, at least with some approximations or restrictions, of the eigenvalues and eigenvectors of $S$.

4.1 Infinite dimension 

In the ideal case of a doubly infinite sample, the matrix $S$ is an SBT operator whose non-null
elements are the Fourier coefficients of the trigonometric polynomial (the symbol of the matrix,
see Grenander and Szegő, 1958)

$$H(\nu) = \sum_{d=-h}^{h} w_d e^{\imath \nu d}$$

and

$$\lim_{n \to \infty} \frac{1}{n} \sum_{i=1}^{n} \lambda_i = \frac{1}{2\pi} \int_{-\pi}^{\pi} H(\nu) \, d\nu$$

with

$$\lambda_1 \le \max_{\nu} H(\nu), \quad \lambda_n \ge \min_{\nu} H(\nu).$$

$H(\nu)$ is the transfer function of the filter evaluated at the frequency $\nu$, expressed in radians. The
fundamental eigenvalue distribution theorem states that the spectrum of an infinite SBT matrix 
is dense on the set of values that the transfer function of the symmetric filter can assume and no 
revisions or phase shifts intervene in the estimation process. 

In finite dimension, the analytical form of eigenvalues and eigenvectors is known only for 
few classes of matrices, which are the tridiagonal SBT and matrices belonging to some algebras, 
namely the Circulant, the Hartley and the generalised Tau. All these matrix algebras are associated 
with discrete transforms such as, respectively, the Fourier, the Hartley and the various versions of 
the Sine or Cosine; see, respectively, Davis (1979), Bini and Favati (1993), Bozzo and Di Fiore 
(1995) and the survey paper by Kailath and Sayed (1995). In our setting, each algebra corresponds to
a different hypothesis on the future behaviour of the series. Interpreting a smoothing matrix as the
sum of a matrix belonging to one of these algebras plus a perturbation occurring at the boundaries, 
approximate results on the eigenvalues of S can be derived. The size of the perturbation depends on 
the matrix algebra and on the boundary conditions. In the following, we consider the circulant and 
the reflecting algebras as well as asymmetric filters that approximate a given two sided symmetric 
filter according to a minimum mean square revision error criterion subject to constraints. 

4.2 Circular boundary conditions 

The circularity assumption, namely that the future behaviour of the process is equal to its initial path,
represents the ideal situation in which the transfer function of any asymmetric filter is equal to that
of the symmetric filter and no phase shifts affect the process, as in the infinite case. However,
the circularity assumption has the limitation of being restrictive in the presence of nonstationary 
trends. 

In the sequel, given a two-sided symmetric filter $\{w_{-h}, \dots, w_0, \dots, w_h\}$, we will denote by $S$
the $n \times n$ associated smoothing matrix, with boundary conditions determined by approximate
asymmetric filters, and by $W$ the corresponding circulant matrix (Davis, 1979) structured like a
finite SBT plus circular corrections in the top-right and bottom-left corners,

$$W = \sum_{d=0}^{h} w_d C^{d} + \sum_{d=-h}^{-1} w_d C^{n+d},$$

where $C$ is the circulant matrix (basis) whose first row is the $n$-dimensional vector $[0, 1, 0, 0, \dots, 0]$.
Note that $W$ is symmetric, as follows from the symmetry of the filter weights. For a square matrix
$A$ we will denote its spectrum by $\sigma(A)$ and its 2-norm by $\|A\|_2 = \sqrt{\rho(A'A)}$, where $\rho(A)$ is the
spectral radius of $A$, which is the maximum modulus of its eigenvalues. With this preliminary
notation, we are able to state the following result on the eigenvalues of a trend filter matrix. The
proof is in the appendix.
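As a numerical companion to the definition of $W$, the sketch below builds the circulant matrix of the 13-term Henderson filter for $n = 51$ as a polynomial in the basic circulant $C$, and compares its eigenvalues with the transfer function evaluated at the Fourier frequencies $2\pi k/n$, $k = 0, \dots, n - 1$, a standard property of circulant matrices. Function names are hypothetical and NumPy is an illustrative choice.

```python
import numpy as np

def henderson_weights(h):
    """Symmetric Henderson weights w_{-h}, ..., w_h via the local cubic fit of (4)."""
    j = np.arange(-h, h + 1)
    kappa = ((h + 1) ** 2 - j ** 2) * ((h + 2) ** 2 - j ** 2) * ((h + 3) ** 2 - j ** 2)
    X = np.vander(j, 4, increasing=True).astype(float)
    KX = (kappa / kappa.max())[:, None] * X
    return KX @ np.linalg.solve(X.T @ KX, np.array([1.0, 0.0, 0.0, 0.0]))

def circulant_matrix(w, n):
    """W as a polynomial in the basic circulant C (first row [0, 1, 0, ..., 0])."""
    h = (len(w) - 1) // 2
    C = np.roll(np.eye(n), 1, axis=1)
    W = np.zeros((n, n))
    for d in range(-h, h + 1):
        W += w[d + h] * np.linalg.matrix_power(C, d % n)   # exponent n + d when d < 0
    return W

def transfer_function(w, nu):
    """H(nu) = sum_d w_d exp(i nu d); real because w_d = w_{-d}."""
    h = (len(w) - 1) // 2
    d = np.arange(-h, h + 1)
    return (w[None, :] * np.exp(1j * np.outer(nu, d))).sum(axis=1).real

n, h = 51, 6
w = henderson_weights(h)
W = circulant_matrix(w, n)

eig_W = np.sort(np.linalg.eigvalsh(W))
H_fourier = np.sort(transfer_function(w, 2 * np.pi * np.arange(n) / n))
n_distinct = 1 + int(np.sum(np.diff(eig_W) > 1e-8))

print("max |eig - H|:", np.max(np.abs(eig_W - H_fourier)))
print("distinct eigenvalues:", n_distinct, "out of", n)
```

Because $H(2\pi k/n) = H(2\pi (n - k)/n)$ for a symmetric filter, the eigenvalues come in coincident pairs, which is what the count of distinct values reflects.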
Figure 1: Left. Transfer function of the symmetric Henderson filter, $h = 6$, $\nu \in [0, \pi]$ (line) and
eigenvalues of the associated circulant matrix $W$ (dots), $n = 51$. Right. Eigenvalue
distributions of $W$ (dots) and of $S$ with asymmetric Musgrave-LC (squares), QL (circles), CQ (stars) and LPR
(pluses) filters, absolute values.






Theorem 1 Let $S$ be an $n \times n$ smoothing matrix associated with the symmetric filter $\{w_{-h},
\dots, w_0, \dots, w_h\}$, $n > 2h$, and let $W$ be the corresponding circulant matrix. Then, $\forall \lambda \in \sigma(S)$,
The theorem provides an upper bound on the size of the perturbation of the eigenvalues of $\mathbf{S}$
with respect to those of $\mathbf{W}$, for which an exact analytical expression is available. The quantity
$\delta_W$ measures how far the eigenvalue distribution of a smoothing matrix moves away from that
of the corresponding circulant. In turn, the eigenvalues of the circulant matrix lie on
the transfer function of the symmetric filter, as the left panel of figure 1 shows.
It follows that $\delta_W$ can be taken as a measure of how much the eigenvalue distribution of $\mathbf{S}$
deviates from the transfer function of the associated filter. In the next section, we will show that
the discrete approximation of $H(\nu)$ through the points in $\sigma(\mathbf{W})$ can be improved by assuming
a reflecting behaviour of the process at the ends of the sample. As we will see, this
occurs because $n$-dimensional filtering matrices subject to reflecting boundary conditions have $n$
distinct eigenvalues, whereas circulant matrices have pairwise coincident eigenvalues, i.e. at most
$\frac{n}{2} + 1$ or $\frac{n-1}{2} + 1$ distinct eigenvalues, for $n$ even or odd, respectively.
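
As a numerical check of the relation between the circulant eigenvalues and the transfer function, the following minimal Python sketch builds the circulant matrix of the 13-term Henderson filter ($h = 6$, $n = 51$, as in figure 1) and verifies that $\mathbf{W}\mathbf{v} = \zeta\,\mathbf{v}$ with $\zeta$ equal to the transfer function at the Fourier frequency $2\pi j/n$. The closed-form expression used for the Henderson weights and the helper names `henderson_weights`, `circulant` and `transfer` are assumptions introduced for illustration and do not come from the paper.

```python
import math

def henderson_weights(h):
    """Half of the symmetric Henderson weights [w_0, ..., w_h].

    Assumption: standard closed form for the Henderson filter with m = h + 2.
    """
    m = h + 2
    den = 8 * m * (m**2 - 1) * (4 * m**2 - 1) * (4 * m**2 - 9) * (4 * m**2 - 25)
    return [315 * ((m - 1)**2 - j**2) * (m**2 - j**2) * ((m + 1)**2 - j**2)
            * (3 * m**2 - 16 - 11 * j**2) / den for j in range(h + 1)]

def circulant(w, n):
    """n x n circulant matrix: row t carries w_|d| in column (t + d) mod n."""
    h = len(w) - 1
    W = [[0.0] * n for _ in range(n)]
    for t in range(n):
        for d in range(-h, h + 1):
            W[t][(t + d) % n] += w[abs(d)]
    return W

def transfer(w, nu):
    """Transfer function of a symmetric filter: w_0 + 2 * sum_d w_d cos(d * nu)."""
    return w[0] + 2.0 * sum(w[d] * math.cos(d * nu) for d in range(1, len(w)))

n, h = 51, 6
w = henderson_weights(h)
W = circulant(w, n)

zetas, max_residual = [], 0.0
for j in range(n):
    nu = 2.0 * math.pi * j / n
    zeta = transfer(w, nu)                       # candidate eigenvalue
    v = [math.cos(nu * t) for t in range(n)]     # real eigenvector
    Wv = [sum(W[t][s] * v[s] for s in range(n)) for t in range(n)]
    max_residual = max(max_residual,
                       max(abs(Wv[t] - zeta * v[t]) for t in range(n)))
    zetas.append(zeta)

# Frequencies j and n - j give the same eigenvalue, hence the pairwise coincidence.
distinct = len({round(z, 9) for z in zetas})
print(max_residual, distinct)
```

The residual should be at rounding-error level, and the number of distinct values cannot exceed $\frac{n-1}{2} + 1$ for odd $n$, since the frequencies $2\pi j/n$ and $2\pi (n-j)/n$ give the same cosine sum.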

To illustrate, we consider the symmetric 13-term Henderson filter introduced in section 2 and,
as approximations at the boundaries, the LPR estimators and the following asymmetric filters
based on a minimum mean square revision error strategy, subject to polynomial constraints:

Linear trend - Constant fit (LC): the asymmetric LC filters arise as the best approximations to
the two-sided Henderson filter assuming that $y_t$ is linear and imposing the constraint that
the weights sum to 1. Hence $\mathbf{U} = \mathbf{i}$, the vector of ones. This class contains the well-known
Musgrave (1964) surrogate filters that are commonly used to approximate the Henderson
filters.

Quadratic trend - Linear fit (QL): the asymmetric QL filters arise as the best approximations to
the two-sided Henderson filter assuming that $y_t$ is quadratic and imposing the constraint that
the estimates are capable of reproducing a first degree polynomial. Hence $\mathbf{U}$ is made of the
first two columns of $\mathbf{X}$ whereas $\mathbf{Z}$ contains the third column of $\mathbf{X}$.

Cubic trend - Quadratic fit (CQ): the asymmetric CQ filters arise as the best approximations to
the two-sided Henderson filter assuming that $y_t$ is a cubic function of time and imposing the
constraint that the estimates are capable of reproducing a second degree polynomial. Hence
$\mathbf{U}$ is made of the first three columns of $\mathbf{X}$ whereas $\mathbf{Z}$ contains the fourth column of $\mathbf{X}$.

Except for the LPR filters, all of the asymmetric filters are derived here for fixed values of the
parameters they depend upon, i.e. $\delta_r^2/\sigma^2$, $r = 1, 2, 3$ for LC, QL and CQ respectively, which are
set equal to the value that gives the Musgrave filter approximating the 13-term Henderson
filter, i.e. $\delta_1^2/\sigma^2 = 4/(3.5^2 \pi)$. The parameters $\delta_1, \delta_2, \delta_3$ represent the slope, curvature, and
relative inflexion of the trend.

The results are the following: the size of the perturbation is minimum for the Musgrave-LC
filters, being $\delta_W = 0.5835$, and maximum for the LPR filters, for which $\delta_W = 1.0047$. In
between lie the asymmetric QL, with $\delta_W = 0.8641$, and CQ, with $\delta_W = 0.9876$. As a consequence, the
eigenvalue distributions turn out to be slight translations (towards the right) of the absolute transfer
function (the gain) of the symmetric filter: this implies an increase in the overall variance of the
estimated trend, the increase being greater the larger $\delta_W$ is, as the right panel of figure 1
shows.

The size of the perturbation does not depend on $n$, in that the $n - 2h$ central rows of the matrix
$\mathbf{S} - \mathbf{W}$ are all null. On the other hand, it is highly influenced by the real time filter (last row of
$\mathbf{S}$), applied to estimate the trend at time $t$ using the available observations up to and including $t$.
The fact is that, in general, there is a discontinuity in the behaviour of the real time filter with
respect to the preceding asymmetric ones, due to the rapid increase of the leverage of the filter,
i.e. the weight attached to the observation taken at the same time at which we are estimating the trend, as
the span of the filter decreases. The leverage further tends to increase (up to unity) with
high degrees of the fitting polynomial (for a formal proof, see Proietti and Luati, 2007). Here, we
verify this phenomenon by choosing as smoothing matrix the circulant matrix with first and last
rows replaced by any real time filter of the class introduced above. The resulting values of $\delta_W$
are almost identical to those obtained when the smoothing matrices with the whole set of asymmetric
filters were considered: for Musgrave-LC it is $\delta_W = 0.5247$, for QL it is $\delta_W = 0.8024$, for CQ
it is $\delta_W = 0.9393$, for the LPR filters it is $\delta_W = 0.9547$. Conversely, all the values of $\delta_W$ turn out to be
greater than 0.95 provided that the first and last rows of $\mathbf{S}$ are replaced by the real time LPR filters,
whose leverage is close to one.

Another factor that highly affects the size of the perturbation (and the overall variance of the
trend estimates) is the algebraic multiplicity of the eigenvalue $\lambda = 1$, which we now show to
depend on the degree of the polynomial that the filter is capable of reproducing. The $p$-th degree
polynomial reproduction constraints met in section 2 can be written as



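$$\langle \mathbf{w}_t, \mathbf{i} \rangle = 1, \qquad \langle \mathbf{w}_t, \mathbf{d}_r \rangle = 0, \qquad \forall t = 1, 2, \dots, n \qquad (10)$$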



where $\mathbf{w}_t'$ is the $t$-th row of $\mathbf{S}$, $\mathbf{i}$ is an $n$-dimensional vector of ones and
$\mathbf{d}_r = [(-q)^r, (-q+1)^r, \dots, (n-q-1)^r]'$, with $q = t - 1$, for $t = 1, \dots, n$, $r = 1, \dots, p$. As an example, consider a
polynomial trend $\mu_t = a_0 + a_1 t + a_2 t^2 + \dots + a_p t^p$ and a symmetric filter $\{w_{-h}, \dots, w_0, \dots, w_h\}$. Then
$\hat{m}_t = \sum_{d=-h}^{h} w_d [a_0 + a_1 (t+d) + a_2 (t+d)^2 + \dots + a_p (t+d)^p] = \mu_t$ if $\sum_{d=-h}^{h} w_d = 1$ and
$\sum_{d=-h}^{h} d^r w_d = 0$ for $r = 1, 2, \dots, p$. The conditions (10) imply that

$$\mathbf{S}\mathbf{i} = \mathbf{i}, \qquad \mathbf{S}\mathbf{x}_r = \mathbf{x}_r, \quad r = 1, 2, \dots, p,$$

where $\mathbf{x}_r$ is the vector whose $t$-th coordinate is $a_0 + a_1 t + a_2 t^2 + \dots + a_r t^r$, which means that $\mathbf{i}$
and $\mathbf{x}_r$, $r = 1, 2, \dots, p$, are eigenvectors of $\mathbf{S}$ corresponding to the eigenvalue $\lambda = 1$ of algebraic
multiplicity equal to $p + 1$. It is therefore evident that the greater the algebraic multiplicity of the
eigenvalue equal to one, the greater the displacement between the gain function of the filter (equal
to one for $\nu = 0$ only) and the absolute eigenvalue distribution.
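
The moment conditions behind (10) can be checked exactly for the 13-term Henderson filter. The sketch below is only an illustration: the closed-form weights, the test polynomial `trend` and the series length `n` are assumptions chosen for the example. It uses exact rational arithmetic to verify that the weights sum to one, that the moments of order 1 to 3 vanish, and that an interior row of the smoothing matrix returns a cubic trend unchanged.

```python
from fractions import Fraction

def henderson_weights(h):
    """Exact symmetric Henderson weights [w_0, ..., w_h].

    Assumption: standard closed form for the Henderson filter with m = h + 2.
    """
    m = h + 2
    den = 8 * m * (m**2 - 1) * (4 * m**2 - 1) * (4 * m**2 - 9) * (4 * m**2 - 25)
    return [Fraction(315 * ((m - 1)**2 - j**2) * (m**2 - j**2)
                     * ((m + 1)**2 - j**2) * (3 * m**2 - 16 - 11 * j**2), den)
            for j in range(h + 1)]

h = 6
w = henderson_weights(h)
filt = {d: w[abs(d)] for d in range(-h, h + 1)}   # two-sided filter w_{-h}, ..., w_h

# moments[r] = sum_d d^r w_d; reproduction of degree p needs 1, 0, ..., 0 for r = 0..p.
moments = [sum(d**r * wd for d, wd in filt.items()) for r in range(5)]

def trend(t):
    """Hypothetical cubic trend used only for this check."""
    return Fraction(2) - 3 * t + Fraction(1, 2) * t**2 + Fraction(1, 10) * t**3

n = 30
interior_ok = all(
    sum(wd * trend(t + d) for d, wd in filt.items()) == trend(t)
    for t in range(h + 1, n - h + 1)
)
print(moments, interior_ok)
```

The fourth moment does not vanish, which is consistent with a filter that reproduces polynomials up to degree three only.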



4.3 Reflecting boundary conditions 

Besides the class of circulant matrices, another class of matrices with known spectral properties
even in finite dimension is the $\tau_{\psi\varphi}$ algebra (Bozzo and Di Fiore, 1995), which is associated with
different versions of the Sine and Cosine transforms and constitutes a generalisation of the $\tau$
family (Bini and Capovani, 1983). An $n \times n$ matrix $\mathbf{H}$ belongs to the $\tau_{\psi\varphi}$ class if and only if
and $\psi, \varphi = 0, 1, -1$. The elements $h_{ij}$ of the matrices in $\tau_{\psi\varphi}$ satisfy the cross sum property
$h_{i-1,j} + h_{i+1,j} = h_{i,j-1} + h_{i,j+1}$ subject to boundary conditions determined by $\psi$ and $\varphi$. For the
original $\tau$ algebra, arising when $\psi = \varphi = 0$, the boundary conditions are $h_{0,j} = h_{i,0} = h_{n+1,j} =
h_{i,n+1} = 0$, $i, j = 1, \dots, n$, and all the matrices in $\tau$ can then be derived given their first row
elements. Still based on the first row of $\mathbf{H}$ but more appropriate for our purposes, since it allows us
to obtain the eigenvalues and eigenvectors of $\mathbf{H}$ in $\tau_{\psi\varphi}$ in an amenable form, is the following
way to construct $\mathbf{H}$ as a linear combination of powers of $\mathbf{T}_{\psi\varphi}$ (see Bini and Capovani, 1983,
Proposition 2.2). Let $\mathbf{h}' = [h_1, h_2, \dots, h_n]$ be the first row of $\mathbf{H}$. Then

$$\mathbf{H} = \sum_{j=1}^{n} c_j \mathbf{T}_{\psi\varphi}^{j-1}$$

where $\mathbf{c}$ is the solution of the upper triangular system $\mathbf{Q}\mathbf{c} = \mathbf{h}$ and $\mathbf{Q}$ is the matrix whose $j$-th
column equals the first column of $\mathbf{T}_{\psi\varphi}^{j-1}$. It follows that the eigenvalues of $\mathbf{H}$ are given by

$$\xi_i = \sum_{j=1}^{n} c_j \vartheta_i^{j-1} \qquad (11)$$

where $\vartheta_i$, $i = 1, \dots, n$, are the eigenvalues of $\mathbf{T}_{\psi\varphi}$. The eigenvectors of $\mathbf{H}$ are the same as those of $\mathbf{T}_{\psi\varphi}$.

Let us consider the reflecting hypothesis such that the first missing observation is replaced by
the last available observation, the second missing observation is replaced by the previous to the
last observation and so on, which for a two-sided $(2h+1)$-term estimator corresponds to the real time
filter $\{w_h, w_{h-1} + w_h, \dots, w_1 + w_2, w_0 + w_1\}$, made of $h + 1$ terms. With the constraint of being
centrosymmetric, the reflecting matrix $\mathbf{H}$ belongs to the $\tau_{11}$ algebra and its first row is the vector

$$\mathbf{h}' = [w_0 + w_1, w_1 + w_2, w_2 + w_3, \dots, w_{h-1} + w_h, w_h, 0, \dots, 0]. \qquad (12)$$



With these premises, we are able to construct $\mathbf{H} \in \tau_{11}$ corresponding to the symmetric filter
$\{w_{-h}, \dots, w_0, w_1, \dots, w_h\}$ and to derive the following result where, for ease of notation, we use
the Pochhammer symbol $(j)_q = j(j+1)(j+2)\cdots(j+q-1)$, for $q = 0, 1, \dots, \lfloor \frac{h-j-1}{2} \rfloor$, the latter
term denoting the largest integer less than or equal to $\frac{h-j-1}{2}$.

Theorem 2 Let $\mathbf{S}$ be an $n \times n$ smoothing matrix associated with the symmetric filter $\{w_{-h}, \dots,
w_0, \dots, w_h\}$, and let $\mathbf{H}$ be the corresponding matrix in $\tau_{11}$. Then, $\forall \lambda \in \sigma(\mathbf{S})$, $\exists i \in \{1, 2, \dots, n\}$



such that 



where 
The proof is in section 7. As a by-product, theorem 2 gives the eigenvalues of $\mathbf{H} \in \tau_{11}$, with first
row equal to (12), as an explicit function of the filter weights, as shown in (13). The corresponding
eigenvectors are known (Bozzo and Di Fiore, 1995) and given by

$$\mathbf{z}_i = \sqrt{\frac{2}{n}} \left[ k_i \cos \frac{(2j-1)(i-1)\pi}{2n} \right]_{j = 1, 2, \dots, n} \qquad (14)$$

with $k_i = \frac{1}{\sqrt{2}}$ for $i = 1$ and $k_i = 1$ for $i > 1$. The inferential procedures that will be introduced in
the following section are based on the eigenvalues and eigenvectors given by (13) and (14), respectively.
In the sequel, we discuss the merit of assuming reflecting rather than circulant boundary
conditions, i.e. of basing the inference on theorem 2 rather than on theorem 1.
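
The reflecting construction can be checked numerically. The sketch below is a minimal illustration under stated assumptions: the Henderson weights come from a standard closed form, the helper names are hypothetical, and the eigenvalue attached to the $i$-th cosine vector is taken to be the transfer function of the symmetric filter at $\nu = (i-1)\pi/n$, which the code verifies directly instead of relying on (13). It builds $\mathbf{H}$ by reflecting the series at both ends, confirms that its first row has the form (12), and checks that $\mathbf{H}\mathbf{z}_i = \xi_i \mathbf{z}_i$ for the cosine vectors of (14).

```python
import math

def henderson_weights(h):
    """Symmetric Henderson weights [w_0, ..., w_h] (assumed closed form, m = h + 2)."""
    m = h + 2
    den = 8 * m * (m**2 - 1) * (4 * m**2 - 1) * (4 * m**2 - 9) * (4 * m**2 - 25)
    return [315 * ((m - 1)**2 - j**2) * (m**2 - j**2) * ((m + 1)**2 - j**2)
            * (3 * m**2 - 16 - 11 * j**2) / den for j in range(h + 1)]

def transfer(w, nu):
    """Transfer function w_0 + 2 * sum_d w_d cos(d * nu)."""
    return w[0] + 2.0 * sum(w[d] * math.cos(d * nu) for d in range(1, len(w)))

def reflecting_matrix(w, n):
    """Filter matrix when missing values are mirrored at both ends (needs n > h)."""
    h = len(w) - 1
    H = [[0.0] * n for _ in range(n)]
    for t in range(n):                      # 0-based time index
        for d in range(-h, h + 1):
            s = t + d
            if s < 0:
                s = -s - 1                  # mirror about the first observation
            elif s >= n:
                s = 2 * n - 1 - s           # mirror about the last observation
            H[t][s] += w[abs(d)]
    return H

n, h = 51, 6
w = henderson_weights(h)
H = reflecting_matrix(w, n)

xi, max_residual = [], 0.0
for i in range(1, n + 1):
    k_i = 1.0 / math.sqrt(2.0) if i == 1 else 1.0
    z = [math.sqrt(2.0 / n) * k_i
         * math.cos((2 * j - 1) * (i - 1) * math.pi / (2 * n))
         for j in range(1, n + 1)]
    xi_i = transfer(w, (i - 1) * math.pi / n)
    Hz = [sum(H[t][s] * z[s] for s in range(n)) for t in range(n)]
    max_residual = max(max_residual,
                       max(abs(Hz[t] - xi_i * z[t]) for t in range(n)))
    xi.append(xi_i)

print(max_residual, len({round(x, 9) for x in xi}))
```

All computations are in real arithmetic, and the eigenvalues sample the transfer function on a grid of $n$ frequencies in $[0, \pi)$ instead of the roughly $n/2$ distinct frequencies available to the circulant.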

Indeed, there are several advantages in adopting the approximation for $\mathbf{S}$ given by $\mathbf{H} \in \tau_{11}$
instead of the circulant approximation provided by $\mathbf{W}$. First, all the operators belonging to $\tau$
algebras have real eigenvalues and eigenvectors. All the computations related to this class can
therefore be done in real arithmetic. Secondly, the reflecting hypothesis underlying the $\tau_{11}$
algebra is more appropriate than that of a circular process when the signal is a non stationary
function of time, as is the case when we are interested in its estimate. It should be recalled that
the estimation methods considered so far are local, so that the boundary conditions only concern
a neighbourhood of the ending observations. For fixed bandwidth methods this means that only
$h, h+1, \dots, 2h$ observations are involved in the asymmetric filtering; if a nearest neighbourhood
approach is followed, then $2h + 1$ observations will be weighted even at the extremes of the
series. Another aspect that deserves to be remarked on concerns the absolute size of the perturbation
(an overestimate of the true distance between eigenvalues), which is smaller for reflecting than for
circulant boundary conditions, i.e. $\delta_H < \delta_W$. In fact, in general, Circulant-to-Toeplitz corrections
produce perturbations that are not smaller than Tau-to-Toeplitz corrections, since while $\mathbf{H}$ is
structured as (9), the circulant $\mathbf{W}$ has nonzero corrections in the top right and bottom left $h \times h$ blocks.
When the matrix elements are the same, this results in a greater perturbation. Table 1 illustrates
this property for the class of approximate filters considered before.

Table 1: Values of $\delta$ for $h = 6$, $\tau_{11}$ and circulant algebras, approximate asymmetric filters.

                           LC       QL       CQ       LPR
$\delta_H$ (Reflecting)    0.1608   0.3817   0.7493   0.8351
$\delta_W$ (Circulant)     0.5835   0.8641   0.9876   1.0047

Finally, as we anticipated in the preceding subsection, the main aspect concerning the $\tau_{11}$
approximation is that $\mathbf{H}$ has $n$ distinct eigenvalues compared to the at most $\lfloor n/2 \rfloor + 1$ of $\mathbf{W}$; compare
the left panel of figure 2 with the left panel of figure 1. It follows that the eigenvalue distribution
of any smoothing matrix $\mathbf{S}$ having the form of (9) can be approximated by that of the corresponding
$\mathbf{H} \in \tau_{11}$, having the same submatrix structure, with a deviation smaller than $\delta_H$. A deviation of the same
order occurs between the eigenvalue distribution of $\mathbf{S}$ and the transfer function of the
corresponding symmetric filter, as shown in the right panel of figure 2.

Figure 2: Left. Transfer function of the symmetric Henderson filter, $h = 6$, $\nu \in [0, \pi]$ (line)
and eigenvalues of the associated reflecting matrix $\mathbf{H}$ (crosses), $n = 51$. Right. Gain function of
the symmetric Henderson filter, $h = 6$ (line) and eigenvalue distributions of $\mathbf{S}$ with asymmetric
Musgrave-LC (squares), QL (circles), CQ (stars) and LPR (pluses) filters.

To conclude our discussion on the eigenvalues of $\mathbf{S}$, we remark that their complex part is
merely generated by the finite approximation and is not related to the phase shift that in general affects
the asymmetric filters. This can be easily understood by means of a counterexample: the matrix
associated with a cubic smoothing spline (see Wahba, 1990, and Green and Silverman, 1994) is
symmetric, so that its eigenvalues are real even if the asymmetric filters do produce phase shifts.

We now consider the eigenvectors. In the preceding section we have proven that if the filter
reproduces a polynomial of order $p$, then there exist $p + 1$ eigenvectors, associated with the
eigenvalue $\lambda = 1$, that describe a constant ($r = 0$), linear ($r = 1$), quadratic ($r = 2$), cubic ($r = 3$) and
so on up to a $p$-th order polynomial function of time. In general, the analytical expression of
the eigenvectors of a smoothing matrix cannot be derived using perturbation theory, not even
in an approximate form. However, evaluating the action of $\mathbf{S}$ on the eigenvectors of $\mathbf{H}$, we are able
to show that, except at the boundaries, the latent components of $\mathbf{S}$ can be fairly approximated by
those of $\mathbf{H}$. In fact, let us decompose the time series $\mathbf{y}$ as a linear combination of the $n$ known real
and orthogonal latent components represented by the eigenvectors of $\mathbf{H}$,

$$\mathbf{y} = \theta_1 \mathbf{z}_1 + \theta_2 \mathbf{z}_2 + \dots + \theta_n \mathbf{z}_n$$

where the $\mathbf{z}_i$ are given by (14) and $\boldsymbol{\theta} = [\theta_1, \dots, \theta_n]'$ is a vector of coefficients. It follows from

Figure 3: Coordinates of the first four eigenvectors $\mathbf{z}_i$, $i = 1, 2, 3, 4$, of $\mathbf{H}$ (crosses) and of
$(\mathbf{I} + \Delta_H)\mathbf{z}_i$ (circles) plotted against $t = 1, 2, \dots, n$, $n = 13$, $h = 6$, symmetric Henderson filter.

where $\Delta_H \mathbf{z}_i$ is a vector of zeros except for the first and last $h$ coordinates, i.e.

$$\Delta_H \mathbf{z}_i = \begin{bmatrix} \mathbf{z}^* \\ \mathbf{0} \\ \mathbf{E}_h \mathbf{z}^* \end{bmatrix} \quad \text{and} \quad z^*_l = \sum_{j=1}^{q} (s_{lj} - h_{lj}) z_{ij}$$

for $q = h + 1, \dots, 2h$ and $l = 1, 2, \dots, h$. Due to the fact that
the elements of both $\mathbf{S}$ and $\mathbf{H}$ add up to one and their absolute values are in general smaller than
one, the values in $\mathbf{z}^*$ and in $\mathbf{E}_h \mathbf{z}^*$ are almost zero. This holds not only for $n \gg h$, which is
the case when we usually apply local filters, but also for $n$ close to $h$, as figure 3 illustrates in the
limiting case of $n = 2h + 1$, where the approximation concerns the maximum number of boundary
approximations, namely $n - 1$.

5 Filter design in the time domain 

The results of the preceding sections are applied for a filter design in the time domain. The aim is
to obtain estimates with smaller variance and almost equal bias than those produced by $\mathbf{S}$. The
method consists of modifying $\mathbf{S}$ so that the $n - k$ high frequency noisy components that the filter is
not capable of eliminating are given zero weight. This is done through the spectral decomposition
of $\mathbf{H}$. The choice of $k$, i.e. of the cut-off eigenvalue $\xi_k$, will be discussed later in this section.

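Decomposing $\mathbf{S} = \mathbf{H} + \Delta_H$ and $\mathbf{H} = \mathbf{Z}\Xi\mathbf{Z}'$, where $\Xi = \mathrm{diag}\{\xi_1, \xi_2, \dots, \xi_n\}$, and writing
$\mathbf{y} = \mathbf{Z}\boldsymbol{\theta}$, we get

$$\mathbf{S}\mathbf{y} = \mathbf{Z}\Xi\boldsymbol{\theta} + \Delta_H \mathbf{Z}\boldsymbol{\theta} \approx \mathbf{Z}\Xi_k\boldsymbol{\theta} + \Delta_H \mathbf{Z}\boldsymbol{\theta}$$

where $\Xi_k$ is the matrix obtained by replacing with zeros the eigenvalues of $\mathbf{H}$ that are smaller
than a cut-off eigenvalue $\xi_k$ and $\Delta_H \mathbf{Z}\boldsymbol{\theta}$ is a null vector except for the first and last elements that
account for the boundary conditions. Turning to the original coordinate system and arranging the
boundaries, we get the new estimator

$$\mathbf{S}_k = \mathbf{H}_k + \Delta_k + \Delta_H = \mathbf{H}_{(k)} + \Delta_H$$

where $\mathbf{H}_{(k)}$ is the matrix with boundaries equal to those of $\mathbf{H}$ and interior equal to that of
$\mathbf{H}_k = \mathbf{Z}\Xi_k\mathbf{Z}'$. In other words, $\mathbf{H}_{(k)}$ is structured like (9) with $\mathbf{H}^a_k = \mathbf{H}^a$, $\mathbf{H}^{a*}_k = t(\mathbf{H}^a)$ and
$\mathbf{H}^s_k = [\mathbf{Z}\Xi_k\mathbf{Z}']^s$. Hence a new smoothing matrix is obtained, $\mathbf{S}_k$, and consequently new trend estimates,
say $\hat{\mathbf{m}}_k$.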

In practice, the procedure is much easier to apply. In fact, given a symmetric filter, it consists
of: obtaining $\mathbf{H}$, replacing it by $\mathbf{H}_k$ and then adjusting the boundaries with suitably chosen asymmetric
filters to get $\mathbf{S}_k$. Besides simplicity and variance improvement in the interior, this procedure
allows a full choice of the set of asymmetric weights. Indeed, in the examples we shall illustrate at
the end of this section, due to the strong impact of the real time filter with respect to all the other asymmetric
ones, we will replace only the last row of $\mathbf{H}_k$.
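
The first two steps of the procedure can be sketched as follows. This is a minimal illustration, not the authors' implementation: the closed-form Henderson weights, the helper names, the cut-off value 0.5 and the toy series `y` are assumptions, and the eigenvalue of $\mathbf{H}$ attached to $\mathbf{z}_i$ is taken to be the transfer function at $(i-1)\pi/n$. The final boundary adjustment with asymmetric filters is deliberately left out.

```python
import math

def henderson_weights(h):
    """Symmetric Henderson weights [w_0, ..., w_h] (assumed closed form, m = h + 2)."""
    m = h + 2
    den = 8 * m * (m**2 - 1) * (4 * m**2 - 1) * (4 * m**2 - 9) * (4 * m**2 - 25)
    return [315 * ((m - 1)**2 - j**2) * (m**2 - j**2) * ((m + 1)**2 - j**2)
            * (3 * m**2 - 16 - 11 * j**2) / den for j in range(h + 1)]

def transfer(w, nu):
    """Transfer function w_0 + 2 * sum_d w_d cos(d * nu)."""
    return w[0] + 2.0 * sum(w[d] * math.cos(d * nu) for d in range(1, len(w)))

def cosine_basis(n):
    """List of the orthonormal cosine eigenvectors z_1, ..., z_n."""
    Z = []
    for i in range(1, n + 1):
        k_i = 1.0 / math.sqrt(2.0) if i == 1 else 1.0
        Z.append([math.sqrt(2.0 / n) * k_i
                  * math.cos((2 * j - 1) * (i - 1) * math.pi / (2 * n))
                  for j in range(1, n + 1)])
    return Z

def truncated_smoother(w, n, cutoff=0.5):
    """H_k = Z Xi_k Z': eigenvalues of H smaller than `cutoff` are set to zero."""
    Z = cosine_basis(n)
    xi = [transfer(w, (i - 1) * math.pi / n) for i in range(1, n + 1)]
    kept = [i for i in range(n) if xi[i] >= cutoff]
    Hk = [[sum(xi[i] * Z[i][t] * Z[i][s] for i in kept) for s in range(n)]
          for t in range(n)]
    return Hk, len(kept)

n, h = 51, 6
Hk, k = truncated_smoother(henderson_weights(h), n)

# Toy series (illustrative only): slow trend plus a fast alternating component.
y = [0.05 * t + math.sin(2 * math.pi * t / 40) + 0.3 * (-1) ** t for t in range(n)]
m_k = [sum(Hk[t][s] * y[s] for s in range(n)) for t in range(n)]
print(k, [round(v, 3) for v in m_k[20:25]])
```

Because the constant eigenvector is always retained (its eigenvalue is one), each row of `Hk` still sums to one; replacing selected boundary rows with asymmetric weights would then give a matrix playing the role of $\mathbf{S}_k$.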

5.1 Bias-variance trade off 

Let us assume that $\boldsymbol{\varepsilon} \sim N(\mathbf{0}, \sigma^2 \mathbf{I})$. The variance of the estimates obtained by $\mathbf{S}$ is given by
$V(\hat{\mathbf{m}}) = V(\mathbf{S}\mathbf{y}) = \mathbf{S}\sigma^2\mathbf{I}\mathbf{S}' = \sigma^2 \mathbf{S}\mathbf{S}'$. It follows that

$$V(\hat{\mathbf{m}}) - V(\hat{\mathbf{m}}_k) = \sigma^2 \left[ \mathbf{Z}(\Xi^2 - \Xi_k^2)\mathbf{Z}' + (\mathbf{H} - \mathbf{H}_k)\Delta_H' + \Delta_H (\mathbf{H} - \mathbf{H}_k)' - (\mathbf{H}_k \Delta_k' + \Delta_k \mathbf{H}_k') \right]$$

where $\mathbf{Z}(\Xi^2 - \Xi_k^2)\mathbf{Z}'$ is the main contribution to the variance in the interior and is nonnegative
in the sense of a positive semidefinite matrix; the remaining summands yield a matrix with non null
first and last $2h$ rows only, given that $\Delta_H$ and $\Delta_k$ have top left and bottom right nonzero blocks
of dimension $h \times 2h$. So, even if they mainly account for the variance at the boundaries, they also
contribute to the variance in the interior. However, for $h \ll n$ the contribution is negligible with
respect to that of the first summand and it is common to both $\hat{\mathbf{m}}$ and $\hat{\mathbf{m}}_k$.

The bias is given by $B(\hat{\mathbf{m}}) = \boldsymbol{\mu} - E(\hat{\mathbf{m}}) = \boldsymbol{\mu} - (\mathbf{H} + \Delta_H)\boldsymbol{\mu} = [\mathbf{I} - (\mathbf{H} + \Delta_H)]\boldsymbol{\mu}$. Being
introduced by the filtering procedure, the bias becomes smaller as $\mathbf{S}$ tends to the identity matrix
(in terms of the eigenvalues, when $\mathbf{S} = \mathbf{I}$ there are $n$ eigenvalues equal to one and therefore the filter
is capable of reproducing an $n$-degree polynomial interpolating the data, i.e. the series itself).
Comparing the bias of the two estimators we see that

$$B(\hat{\mathbf{m}}) - B(\hat{\mathbf{m}}_k) = [\mathbf{Z}(\Xi_k - \Xi)\mathbf{Z}' + \Delta_k]\boldsymbol{\mu}$$

and so a measure of the discrepancy between the bias of $\hat{\mathbf{m}}_k$ and that of $\hat{\mathbf{m}}$, in the interior, is

$$\frac{1}{n} \mathrm{tr}\{\Xi_k - \Xi\} = -\frac{1}{n} \sum_{i=k+1}^{n} \xi_i.$$

In general $\mathrm{tr}\{\Xi_k - \Xi\}$ is a negative quantity that, normalised by $n$, is negligible, given that the
last $n - k$ eigenvalues are almost zero, as follows from (13).
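
The two interior measures above can be evaluated directly from the eigenvalues. The following sketch is illustrative only: it assumes the closed-form Henderson weights, takes the eigenvalues of $\mathbf{H}$ to be the transfer function at $(i-1)\pi/n$, fixes the cut-off at 0.5 and sets the hypothetical noise variance `sigma2` to one. It computes the trace of the leading variance term, $\sigma^2 \sum_{i>k} \xi_i^2$, and the bias measure $-\frac{1}{n}\sum_{i>k} \xi_i$.

```python
import math

def henderson_weights(h):
    """Symmetric Henderson weights [w_0, ..., w_h] (assumed closed form, m = h + 2)."""
    m = h + 2
    den = 8 * m * (m**2 - 1) * (4 * m**2 - 1) * (4 * m**2 - 9) * (4 * m**2 - 25)
    return [315 * ((m - 1)**2 - j**2) * (m**2 - j**2) * ((m + 1)**2 - j**2)
            * (3 * m**2 - 16 - 11 * j**2) / den for j in range(h + 1)]

def transfer(w, nu):
    """Transfer function w_0 + 2 * sum_d w_d cos(d * nu)."""
    return w[0] + 2.0 * sum(w[d] * math.cos(d * nu) for d in range(1, len(w)))

n, h, sigma2 = 51, 6, 1.0          # sigma2 is a hypothetical noise variance
w = henderson_weights(h)
xi = [transfer(w, (i - 1) * math.pi / n) for i in range(1, n + 1)]

dropped = [x for x in xi if x < 0.5]     # eigenvalues replaced by zeros in Xi_k
k = n - len(dropped)

# Trace of sigma^2 * Z (Xi^2 - Xi_k^2) Z'; Z is orthogonal, so only the
# squared dropped eigenvalues contribute.
variance_gain = sigma2 * sum(x * x for x in dropped)
total_variance = sigma2 * sum(x * x for x in xi)

# (1/n) tr(Xi_k - Xi): minus the average of the dropped eigenvalues.
bias_gap = -sum(dropped) / n

print(k, variance_gain / total_variance, bias_gap)
```

The ratio `variance_gain / total_variance` gives the share of the interior variance trace removed by the truncation, to be weighed against the size of `bias_gap`.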

The choice of $k$ is a further balancing of the trade-off between bias and variance of the filter.
The trend in the interior is made smoother without appreciably increasing the bias. There are several
options regarding how to choose $k$. One of them, which we shall adopt in our illustrations, consists
of selecting $k$, or equivalently $\xi_k$, that minimises the distance of the eigenvalue distribution of $\mathbf{H}$
from that of the ideal low pass filter having first $k$ eigenvalues equal to one and last $n - k$ equal
to zero. In other words, we look for $k$ such that

$$f(k) = \|\mathbf{i}_{(k)} - \boldsymbol{\xi}\|^2 \qquad (15)$$

is minimum, where $\mathbf{i}_{(k)}$ is an $n$ dimensional vector with first $k$ coordinates equal to one and the
remaining equal to zero, whereas $\boldsymbol{\xi} = [\xi_1, \xi_2, \dots, \xi_n]'$ and the $\xi_i$ are given by (13). The function
$f(k) = \sum_{i=1}^{k} (1 - \xi_i)^2 + \sum_{i=k+1}^{n} \xi_i^2$ can be written as $f(k) = f(k-1) + (1 - \xi_k)^2 - \xi_k^2 = f(k-1) + (1 - 2\xi_k)$
and therefore reaches its minimum for $\xi_k = 0.5$. This strategy is equivalent to finding the cut-off
frequency that minimises the distance between the transfer functions of the symmetric filter and
of the ideal low-pass filter

$$\int_{-\pi}^{\pi} |I(\nu) - H(\nu)|^2 \, d\nu$$

where $I(\nu) = 1$ for $|\nu| \le \nu_k$ and $I(\nu) = 0$ otherwise. The equivalence is based on the relation
between time and frequency domain. In fact, for a fixed cut-off frequency $\nu = \nu_k$, the cut-off
time $k = \frac{n \nu_k}{\pi}$ is obtained with a precision that increases with $n$. For instance, if
we are given monthly data and are interested in removing 10-month cycles that can be wrongly
interpreted as turning points of the trend curve, then we may replace by zeros all the eigenvalues
smaller than $\xi_k$ with $k = \frac{2n}{10}$.

Finally, we would like to remark that whenever the interest is in the smoothness of the new
estimator rather than in the exact value of $k$, a graphical inspection method may be appropriate.
Having plotted the eigenvalue distribution, a suitable cut-off eigenvalue may be read off directly.
If the choice of $k$ is not related to a formal inferential procedure (e.g. restrictions on the bias) this
method works well.
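
The selection rule based on (15) is easy to evaluate by brute force. The sketch below is a hedged illustration under the same assumptions as before (closed-form Henderson weights, eigenvalues of $\mathbf{H}$ equal to the transfer function at $(i-1)\pi/n$, $n = 51$, $h = 6$): it computes $f(k)$ for every $k$, locates the minimiser `k_star`, and confirms the recursion $f(k) = f(k-1) + (1 - 2\xi_k)$.

```python
import math

def henderson_weights(h):
    """Symmetric Henderson weights [w_0, ..., w_h] (assumed closed form, m = h + 2)."""
    m = h + 2
    den = 8 * m * (m**2 - 1) * (4 * m**2 - 1) * (4 * m**2 - 9) * (4 * m**2 - 25)
    return [315 * ((m - 1)**2 - j**2) * (m**2 - j**2) * ((m + 1)**2 - j**2)
            * (3 * m**2 - 16 - 11 * j**2) / den for j in range(h + 1)]

def transfer(w, nu):
    """Transfer function w_0 + 2 * sum_d w_d cos(d * nu)."""
    return w[0] + 2.0 * sum(w[d] * math.cos(d * nu) for d in range(1, len(w)))

n, h = 51, 6
w = henderson_weights(h)
xi = [transfer(w, (i - 1) * math.pi / n) for i in range(1, n + 1)]

def f(k):
    """Squared distance between xi and the ideal low-pass vector with k ones."""
    return sum((1.0 - x) ** 2 for x in xi[:k]) + sum(x * x for x in xi[k:])

values = [f(k) for k in range(n + 1)]
k_star = min(range(n + 1), key=lambda k: values[k])

# Largest deviation from the recursion f(k) - f(k-1) = 1 - 2 * xi_k.
recursion_gap = max(abs(values[k] - values[k - 1] - (1.0 - 2.0 * xi[k - 1]))
                    for k in range(1, n + 1))

print(k_star, xi[k_star - 1], xi[k_star], recursion_gap)
```

The minimiser is the last index whose eigenvalue still exceeds 0.5, so for this filter the rule amounts to counting the eigenvalues above one half.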

5.2 Empirical analysis

In this section we provide illustrations of the eigenvalue-based method for reducing the variance
of the trend estimates obtained with a given symmetric filter applied to real data. As a
symmetric estimator, we consider the 13-term Henderson (1916) filter, which plays a prominent
role in empirical applications, especially for trend estimation within the X-11 filter, which is an
integral part of the X-12-ARIMA procedure, the official seasonal adjustment procedure in the
U.S., Canada, the U.K. and many other countries. See Dagum (1980), Findley et al. (1998) and
Ladiray and Quenneville (2001) for more details. As for the asymmetric filters, the reflecting ones
have been chosen except for the case of the real time filter. In particular, the QL (Proietti and
Luati, 2007) and Musgrave (1964) real time filters discussed in section 4.2 have been applied and
compared. The smoothing matrix $\mathbf{S}$ is therefore equal to $\mathbf{H}$ except for the last row. To
obtain $\mathbf{S}_k$ we find the spectral decomposition of $\mathbf{H}$ and select the cut-off eigenvalue according to
(15), i.e. $\xi_k = 0.5$.

Our first illustration deals with the Italian index of industrial production. The top panel of figure
4 represents the original series with the trend estimates $\hat{\mathbf{m}}$ (dotted line) and $\hat{\mathbf{m}}_k$ (continuous line).
The gain in smoothness obtained using the latter estimator is not so evident in the whole series but
can be clearly seen in the central panel of figure 4, where a subset of the data is represented. The
estimates obtained by $\hat{\mathbf{m}}_k$ are less sensitive to the fluctuations of the series; note in particular the
behaviour in the period ranging from June 1999 to June 2001, when the series is slightly increasing:
the original filter estimates are sensitive to noisy fluctuations that do not affect the modified version,
where highly noisy components are removed instead of just smoothed. The bottom panel shows
the last year estimates to give an idea of a comparison among asymmetric filters: the Musgrave
real time filter (dots) behaves almost like the reflecting one (dotted line) whereas the QL real time
filter (continuous line) follows the increase of the series.

Analogous considerations apply to our second illustration, which concerns the retail series of
Euro area 4, see figure 5. This series is affected by an increase in variability even during periods of
stationarity of the trend, as the top panel of the figure shows. The 13-term Henderson filter (dotted
line) is known to be particularly reactive to short cycles that, if not smoothed enough, can be
wrongly interpreted as turning points. The central part of figure 5 illustrates that the modified
estimator (continuous line), where eigenvalues smaller than 0.5 are replaced by zeros, produces
smoother trend values without affecting the capability of catching true turning points, such as the one that
occurred in November 1994. As in the previous case, in the bottom panel of figure 5 the last year
estimates obtained with Musgrave (dots), the reflecting (dotted line) and the QL (continuous line)
real time filters are illustrated. Even in this case, the QL reacts to the changes in the direction of
the series more than the other two estimators, which behave almost in the same manner.

Figure 4: Index of Industrial Production, Italy. Source: Istat. Top and center. Original series with
estimates obtained by $\mathbf{H}$ (dotted line) and by $\mathbf{H}_k$ (continuous line). Bottom. Real time estimates
by QL (continuous line), reflecting (dotted line) and Musgrave (dots) real time filters. Panels, from
top to bottom: Jan 1990-Dec 2006, Jun 1999-Dec 2006 and Jan 2006-Dec 2006, monthly data.

Figure 5: Euro Area Industry, Retail Ea4. Source: European Commission. Top and center. Original
series with estimates obtained by $\mathbf{H}$ (dotted line) and by $\mathbf{H}_k$ (continuous line). Bottom. Real time
estimates by QL (continuous line), reflecting (dotted line) and Musgrave (dots) real time filters.

6 Concluding remarks

This paper provided a decomposition of time series into periodic latent components that depends
on the underlying trend estimation method. In particular, given a symmetric local polynomial
regression estimator with reflecting boundary conditions, the latent components are given exactly
by equation (14). These will be smoothed by an amount equal to (13). If different asymmetric
filters for current trend estimation are adopted at the boundaries, then an approximation whose
size was given in theorem 2 occurs in the eigenvalue distribution.

Concerning the latter, it was shown in the paper that, in finite dimension, an approximated
version of the fundamental eigenvalue distribution theorem holds. In fact, the eigenvalue distribution
of a trend filter matrix turned out to be a discrete approximation of the transfer function of the
corresponding symmetric filter. Once again, the size of the approximation depends on the boundary
conditions. Circular and reflecting boundary conditions were illustrated and discussed. In any
case, it emerged that the higher the degree of the polynomial trend that the locally weighted regression
method is capable of reproducing, the worse the approximation to the transfer function of the filter
becomes, essentially due to the exploding behaviour of the real time filter when the degree of the fitting
polynomial increases.

It followed that, like the transfer function, the eigenvalue distribution represents a
measure of the overall variance left in the input series by the smoothing procedure. More importantly,
the decomposition into periodic latent components with which smoothing eigenvalues are associated
constituted a framework for reducing the variance of the estimates.

In fact, based on the analytical knowledge of the eigenvectors and eigenvalues, it was possible
to improve the inferential properties of a given filter by annihilating noisy components that would
have been otherwise only smoothed. The selection of a cut-off eigenvalue after which all the
components received zero weight was discussed, and new filters with smaller variance and almost
equal bias to the original one were thus derived. Applications to real data showed the variance
improvement, especially as regards short cycles that may wrongly be interpreted as turning
points of the trend-cycle.

7 Appendix 

Proof of theorem 1 The matrix $\mathbf{S}$ can be written as $\mathbf{S} = \mathbf{W} + \Delta_W$, where $\Delta_W = \mathbf{S} - \mathbf{W}$.
The circulant matrix $\mathbf{W}$ is diagonalised by the Fourier matrix

$$\Omega = \frac{1}{\sqrt{n}} \left[ \omega^{(i-1)(j-1)} \right], \quad i, j = 1, \dots, n,$$

with $\omega$ a primitive $n$-th root of unity, satisfying $\|\Omega\|_2 \|\Omega^{-1}\|_2 = 1$, and its spectrum is
$\sigma(\mathbf{W}) = \{\zeta_1, \zeta_2, \dots, \zeta_n\}$, with

$$\zeta_i = \sum_{d=-h}^{h} w_d \, \omega^{d(i-1)}.$$

Setting $\delta_W = \|\Delta_W\|_2$, the thesis follows from the Bauer-Fike perturbation theorem applied
choosing the 2-norm as an absolute norm (Bauer and Fike, 1960). ■

Proof of theorem 2 Let us write $\mathbf{S} = \mathbf{H} + \Delta_H$. The first part of the proof is analogous to the
proof of theorem 1, provided that the matrix $\mathbf{H}$ is diagonalised by the orthogonal matrix

$$\mathbf{Z} = \sqrt{\frac{2}{n}} \left[ k_j \cos \frac{(2i-1)(j-1)\pi}{2n} \right], \quad i, j = 1, 2, \dots, n$$

where $k_j = \frac{1}{\sqrt{2}}$ for $j = 1$ and $k_j = 1$ for $j > 1$, which satisfies $\|\mathbf{Z}\|_2 \|\mathbf{Z}^{-1}\|_2 = 1$. The spectrum
of $\mathbf{H}$ is $\sigma(\mathbf{H}) = \{\xi_1, \xi_2, \dots, \xi_n\}$, where

$$\xi_i = \sum_{j=1}^{n} c_j \vartheta_i^{j-1},$$

which follows from (11) and from the fact that the eigenvalues of $\mathbf{T}_{11}$ are (Bini and Capovani, 1983)

$$\vartheta_i = 2 \cos \frac{(i-1)\pi}{n}.$$

Setting $\delta_H = \|\Delta_H\|_2$ and applying the Bauer-Fike theorem with the 2-norm as an absolute norm
gives

$$\left| \lambda - \sum_{j=1}^{n} c_j \left( 2 \cos \frac{(i-1)\pi}{n} \right)^{j-1} \right| \le \delta_H.$$



We now prove that $c_j = 0$ for $j > h + 1$, so that the above summation involves just $h + 1$ terms
instead of $n$. It follows from Cramer's rule that, explicitly,

$$c_j = \frac{\det \mathbf{Q}[j, \mathbf{h}]}{\det \mathbf{Q}}$$

where $\mathbf{Q}[j, \mathbf{h}]$ is the matrix obtained by replacing the $j$-th column of $\mathbf{Q}$ by the vector $\mathbf{h}$. The matrix
$\mathbf{Q}$ is upper triangular with ones on the diagonal, so its determinant is equal to one, and since the
generic element $h_j$ of $\mathbf{h}$ is null for $j > h + 1$ it follows that $\det \mathbf{Q}[j, \mathbf{h}] = 0$ and $c_j$ will be null as
well.

Finally, we prove that

$$c_j = w_{j-1} + \sum_{q=0}^{\lfloor \frac{h-j-1}{2} \rfloor} (-1)^{q+1} \frac{(j)_q}{(q+1)!} (j + 2q + 1) \, w_{j+2q+1}. \qquad (16)$$

This expression can be directly verified by calculating $\det \mathbf{Q}[j, \mathbf{h}]$ for all $j$. In the following,
we prove it by induction over $j = 1, \dots, h + 1$, with $h \in \mathbb{N}$.

• For $j = 1$, $c_1 = w_0 + \sum_{q=0}^{\lfloor \frac{h-2}{2} \rfloor} (-1)^{q+1} 2 w_{2q+2}$, which follows from $(1)_q = q!$ and simple
algebra. The linear system $\mathbf{Q}\mathbf{c} = \mathbf{h}$ can be written as $\mathbf{c} = \mathbf{Q}^{-1}(\mathbf{h}_1 + \mathbf{h}_2)$ with $\mathbf{h}_1 =
[w_0, w_1, \dots, w_h, 0, \dots, 0]'$ and $\mathbf{h}_2 = [w_1, w_2, \dots, w_h, 0, \dots, 0]'$, both $n$-dimensional vectors.
Since the first row of $\mathbf{Q}^{-1}$ is the vector $[1, -1, -1, 1, 1, -1, -1, \dots]$ we have that
$c_1 = (w_0 + w_1) - (w_1 + w_2) - (w_2 + w_3) + (w_3 + w_4) + \dots = w_0 - 2w_2 + 2w_4 - \dots
+ (-1)^{\lfloor \frac{h-2}{2} \rfloor + 1} 2 w_{2\lfloor \frac{h-2}{2} \rfloor + 2}$ and
therefore (16) holds for $j = 1$.

• For $j = h$, $c_h = w_{h-1}$, as is immediate to see given that the summation in $q$ was defined
for non negative values of $\lfloor \frac{h-j-1}{2} \rfloor$. All the more so, it implies that $c_{h+1} = w_h$. Hence we
have shown that (16) holds for $j = 1$ and that, if it holds for $j = h$, then it holds for
$j = h + 1$. This proves that (16) is true for all $h \in \mathbb{N}$. The proof of theorem 2 is therefore
complete. ■
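
The coefficients can also be checked numerically. The sketch below is an illustration under stated assumptions: it reads (16) as $c_j = w_{j-1} + \sum_{q} (-1)^{q+1} \frac{(j)_q}{(q+1)!} (j+2q+1) w_{j+2q+1}$, with the sum running over all $q \ge 0$ such that $j + 2q + 1 \le h$, uses closed-form Henderson weights that are not given in the paper, and introduces the hypothetical helpers `pochhammer` and `tau_coefficients`. It compares $\sum_j c_j \vartheta_i^{j-1}$, with $\vartheta_i = 2\cos\frac{(i-1)\pi}{n}$, against the cosine sum $w_0 + 2\sum_d w_d \cos\frac{d(i-1)\pi}{n}$.

```python
import math

def henderson_weights(h):
    """Symmetric Henderson weights [w_0, ..., w_h] (assumed closed form, m = h + 2)."""
    m = h + 2
    den = 8 * m * (m**2 - 1) * (4 * m**2 - 1) * (4 * m**2 - 9) * (4 * m**2 - 25)
    return [315 * ((m - 1)**2 - j**2) * (m**2 - j**2) * ((m + 1)**2 - j**2)
            * (3 * m**2 - 16 - 11 * j**2) / den for j in range(h + 1)]

def pochhammer(j, q):
    """Rising factorial (j)_q = j (j + 1) ... (j + q - 1), with (j)_0 = 1."""
    out = 1
    for r in range(q):
        out *= j + r
    return out

def tau_coefficients(w):
    """Coefficients c_1, ..., c_{h+1} computed from the weights as in (16)."""
    h = len(w) - 1
    c = []
    for j in range(1, h + 2):
        total = w[j - 1]
        q = 0
        while j + 2 * q + 1 <= h:
            coeff = ((-1) ** (q + 1) * pochhammer(j, q) * (j + 2 * q + 1)
                     / math.factorial(q + 1))
            total += coeff * w[j + 2 * q + 1]
            q += 1
        c.append(total)
    return c

n, h = 51, 6
w = henderson_weights(h)
c = tau_coefficients(w)

max_gap = 0.0
for i in range(1, n + 1):
    theta = (i - 1) * math.pi / n
    polynomial = sum(c[j] * (2.0 * math.cos(theta)) ** j for j in range(h + 1))
    cosine_sum = w[0] + 2.0 * sum(w[d] * math.cos(d * theta) for d in range(1, h + 1))
    max_gap = max(max_gap, abs(polynomial - cosine_sum))

print(c, max_gap)
```

Agreement of the two expressions on the whole grid supports the reading of (16) used here, since the polynomial in $2\cos\theta$ and the cosine sum are two forms of the same eigenvalue.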